
Opening the black box: approximation and generalization properties of convolutional neural networks in surrogate modeling
Surrogate models are numerical emulators that approximate the behavior of complex systems at reduced computational cost. They play a vital role in fields such as uncertainty quantification and optimal control, particularly in scenarios involving parametrized partial differential equations (PDEs), where they serve as a cheaper alternative to expensive numerical solvers. Recently, Deep Learning has become increasingly popular in this context, providing researchers with powerful new data-driven approaches to surrogate modeling, from DeepONets and Fourier Neural Operators to models based on deep convolutional autoencoders. In this talk I will focus on the latter class of approaches, with particular emphasis on Convolutional Neural Networks (CNNs). In particular, by casting the problem in the framework of operator learning, I will present suitable error bounds that illustrate the role played by each hyperparameter in a convolutional architecture, ultimately unveiling how, and why, CNNs actually work. To this end, I will first present an analysis of the approximation error alone, ignoring model training and optimization; I will then discuss some novel results that aim at bridging this gap, addressing practical issues such as the choice of the training set size and the loss function. Finally, I will conclude with a brief discussion of the broader picture, mentioning how surrogate models can be combined with numerical solvers to optimally manage computational resources, and how prior knowledge can help us construct physically consistent architectures (ensuring, e.g., conservation of mass and angular momentum).
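To make the setting concrete, the sketch below shows, in PyTorch, the general shape of an autoencoder-based surrogate of the kind the abstract refers to: a convolutional autoencoder compresses PDE solution snapshots to a low-dimensional latent code, and a small dense network maps the PDE parameters to that code, so that at test time the expensive solver is bypassed entirely. All names (e.g. ConvAutoencoderSurrogate, param_map), layer sizes, the 64x64 grid, and the combined reconstruction/latent loss are illustrative assumptions, not the speaker's actual architecture; the combined loss is simply one common choice in autoencoder-based surrogate modeling.

```python
import torch
import torch.nn as nn

class ConvAutoencoderSurrogate(nn.Module):
    """Hypothetical sketch of a CNN-based surrogate for a parametrized PDE.
    Not the talk's actual model: sizes and layers are placeholders."""

    def __init__(self, n_params: int, latent_dim: int = 16):
        super().__init__()
        # Encoder: 64x64 solution field -> latent vector
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, stride=2, padding=1),    # 64 -> 32
            nn.ELU(),
            nn.Conv2d(8, 16, kernel_size=3, stride=2, padding=1),   # 32 -> 16
            nn.ELU(),
            nn.Flatten(),
            nn.Linear(16 * 16 * 16, latent_dim),
        )
        # Decoder: latent vector -> 64x64 solution field
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 16 * 16 * 16),
            nn.Unflatten(1, (16, 16, 16)),
            nn.ELU(),
            nn.ConvTranspose2d(16, 8, kernel_size=4, stride=2, padding=1),  # 16 -> 32
            nn.ELU(),
            nn.ConvTranspose2d(8, 1, kernel_size=4, stride=2, padding=1),   # 32 -> 64
        )
        # Parameter-to-latent map: PDE parameters mu -> latent code
        self.param_map = nn.Sequential(
            nn.Linear(n_params, 64), nn.ELU(), nn.Linear(64, latent_dim)
        )

    def forward(self, mu: torch.Tensor) -> torch.Tensor:
        # Inference bypasses the encoder: mu -> latent code -> solution field
        return self.decoder(self.param_map(mu))


# Training sketch on placeholder data standing in for solver snapshots u(mu_i)
model = ConvAutoencoderSurrogate(n_params=2)
mu = torch.rand(32, 2)           # parameter samples (placeholder)
u = torch.rand(32, 1, 64, 64)    # corresponding solution snapshots (placeholder)

opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(5):
    opt.zero_grad()
    z = model.param_map(mu)
    # Reconstruction loss plus a latent-matching term tying the parameter
    # map to the encoder's codes (one common choice, assumed here)
    loss = nn.functional.mse_loss(model.decoder(z), u) \
         + nn.functional.mse_loss(z, model.encoder(u))
    loss.backward()
    opt.step()
```

Once trained, evaluating model(mu_new) costs a single forward pass, which is what makes such surrogates cheap relative to repeated calls to a full-order numerical solver.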